Parallel Computation of Entries of A^{-1}

Authors

  • Patrick Amestoy
  • Iain S. Duff
  • Jean-Yves L'Excellent
  • François-Henry Rouet
Abstract

In this paper, we are concerned with computing, in parallel, several entries of the inverse of a large sparse matrix. We assume that the matrix has already been factorized by a direct method and that the sparse factors are distributed over the processors. Entries of the inverse are computed efficiently by exploiting the sparsity of the right-hand sides and of the solution vectors in the triangular solution phase. We demonstrate that, in this setting, parallelism and computational efficiency are two conflicting objectives. We develop an efficient approach and show its efficacy through runs using the MUMPS code, which implements a parallel multifrontal method for distributed-memory machines.

Key-words: sparse matrices, matrix inverse, direct methods, direct solver, parallelism

Also available as CERFACS, ENSEEIHT-IRIT, and RAL reports.

HAL identifier: hal-00759556, version 2, 21 Dec 2012

Author affiliations:

  • Université de Toulouse, INPT(ENSEEIHT)-IRIT, France ({amestoy,frouet}@enseeiht.fr).
  • CERFACS, 42 Avenue Gaspard Coriolis, 31057, Toulouse, France ([email protected]).
  • R 18, RAL, Oxon, OX11 0QX, England ([email protected]).
  • INRIA and Université de Lyon, Laboratoire LIP (UMR 5668 – CNRS, ENS Lyon, INRIA, UCBL), France ([email protected]).
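The idea stated in the abstract, that entry (i, j) of A^{-1} is component i of the solution of A x = e_j and that both triangular solves can skip work that is known to be zero or unneeded, can be illustrated with a small sketch. This is not the MUMPS implementation or its API. It assumes a small dense matrix, an LU factorization without pivoting, and hypothetical function names (`lu_no_pivot`, `inverse_entry`). In the sparse setting the pruning would also have to follow the sparsity structure of the factors, which this toy version does not model.

```python
def lu_no_pivot(a):
    """Doolittle LU factorization without pivoting (assumes nonzero pivots)."""
    n = len(a)
    lower = [[1.0 if r == c else 0.0 for c in range(n)] for r in range(n)]
    upper = [row[:] for row in a]
    for k in range(n):
        for r in range(k + 1, n):
            factor = upper[r][k] / upper[k][k]
            lower[r][k] = factor
            for c in range(k, n):
                upper[r][c] -= factor * upper[k][c]
    return lower, upper


def inverse_entry(lower, upper, i, j):
    """Return entry (i, j) of A^{-1}: component i of the solution of A x = e_j."""
    n = len(lower)
    # Forward solve L y = e_j. Since e_j is zero above row j, y[k] = 0 for
    # k < j, so the solve can start at row j (sparse right-hand side).
    y = [0.0] * n
    y[j] = 1.0
    for r in range(j + 1, n):
        y[r] = -sum(lower[r][c] * y[c] for c in range(j, r))
    # Backward solve U x = y. Only x[i] is requested, so the solve can stop
    # once row i has been processed (sparse solution).
    x = [0.0] * n
    for r in range(n - 1, i - 1, -1):
        tail = sum(upper[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (y[r] - tail) / upper[r][r]
    return x[i]


# Illustrative 3x3 tridiagonal matrix (an assumption, not data from the paper).
a = [[4.0, 1.0, 0.0], [1.0, 4.0, 1.0], [0.0, 1.0, 4.0]]
lower, upper = lu_no_pivot(a)
print(inverse_entry(lower, upper, 0, 2))
```

The factorization is done once and reused for every requested entry, which mirrors the abstract's assumption that the matrix has already been factorized before entries are computed.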

Similar resources

Parallel computation framework for optimizing trailer routes in bulk transportation

We consider a rich tanker trailer routing problem with stochastic transit times for chemicals and liquid bulk orders. A typical route of the tanker trailer consists of sourcing a cleaned and prepped trailer from a pre-wash location, pickup and delivery of chemical orders, cleaning the tanker trailer at a post-wash location after order delivery and prepping for the next order. Unlike traditiona...

Efficient implementation of low time complexity and pipelined bit-parallel polynomial basis multiplier over binary finite fields

This paper presents two efficient implementations of fast and pipelined bit-parallel polynomial basis multipliers over GF(2^m) by irreducible pentanomials and trinomials. The architecture of the first multiplier is based on a parallel and independent computation of powers of the polynomial variable. In the second structure only even powers of the polynomial variable are used. The par...

Dexterous Workspace Shape and Size Optimization of Tricept Parallel Manipulator

This work intends to deal with the optimal kinematic synthesis problem of Tricept parallel manipulator. Observing that cuboid workspaces are desirable for most machines, we use the concept of effective inscribed cuboid workspace, which reflects requirements on the workspace shape, volume and quality, simultaneously. The effectiveness of a workspace is characterized by the dexterity of the manip...

Optimization of Agricultural BMPs Using a Parallel Computing Based Multi-Objective Optimization Algorithm

Beneficial Management Practices (BMPs) are important measures for reducing agricultural non-point source (NPS) pollution. However, selection of BMPs for placement in a watershed requires optimizing available resources to maximize possible water quality benefits. Due to its iterative nature, the optimization typically takes a long time to achieve the BMP trade-off results which is not desirable ...

An Efficient Algorithm for Workspace Generation of Delta Robot

Dimensional synthesis of a parallel robot may be the initial stage of its design process, which is usually carried out based on a required workspace. Since the link lengths of the robot are usually optimized for the workspace, the workspace computation process must be run numerous times. Hence, the importance of the efficiency of the algorithm and the CPU time of the workspace computatio...

Journal:
  • SIAM J. Scientific Computing

Volume 37

Publication date: 2015